dgit.raspbian.org Git - utf8proc.git/commit

author	Benito van der Zander <benito@benibela.de>
	Tue, 12 Jul 2016 15:51:50 +0000 (17:51 +0200)
committer	Steven G. Johnson <stevenj@mit.edu>
	Tue, 12 Jul 2016 15:51:50 +0000 (11:51 -0400)
commit	eeebf70bcf68443b0b2e5b3d811227ed3f039ea4
tree	a4815a783b88588a2aa30b6a396eebfe1320e250
parent	9a0b87b57ec0be5bdf8baa7d53a4dfeb940d07d8

Smaller tables (#68)

* convert sequences to utf-16 (saves 25kb)

* store sequence length in properties instead of using -1 termination (saves 10kb)

* cache index for slightly faster data creation

* store lower/upper/title mapping in sequence array (saves 25kb). Add utf8proc_totitle, as title_mapping cannot be used to get the title codepoint anymore. Rename xxx_mapping to xxx_seqindex, so programs assuming a value with the old meaning fail at compile time

* change combination array data type to uint16 (saves 40kb)

* merge 1st and 2nd comb index (saves 50kb)

* kill empty prefix/suffix in combination array (saves 50kb)

* there was no need to have a separate combination start array; it can be merged into a single array

* some fixes

* mark the table as const again

* and regen
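
The first two items above (UTF-16 sequences, and a stored length instead of a -1 terminator) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `sequences`, `SEQINDEX` and `seq_decode`, the 3-bit/13-bit split of the index, and the table contents are hypothetical and are not taken from utf8proc_data.c or utf8proc.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shared sequence table, stored as UTF-16 code units.
 * No -1 terminators: each user of the table carries the length itself.
 *   offset 0: U+0041 U+030A    (2 code points, 2 code units)
 *   offset 2: U+1D157 U+1D165  (2 code points, 4 code units, surrogate pairs)
 */
static const uint16_t sequences[] = {
    0x0041, 0x030A,
    0xD834, 0xDD57, 0xD834, 0xDD65,
};

/* Assumed packing: upper 3 bits = length in code points,
 * lower 13 bits = offset into `sequences`. */
#define SEQ_OFFSET_BITS 13
#define SEQ_OFFSET_MASK 0x1FFFu
#define SEQINDEX(len, off) ((uint16_t)(((len) << SEQ_OFFSET_BITS) | (off)))

/* Decode the sequence named by `seqindex` into `out` as UTF-32.
 * Returns the number of code points written. Assumes well-formed UTF-16. */
static size_t seq_decode(uint16_t seqindex, uint32_t *out, size_t cap)
{
    size_t len = (size_t)(seqindex >> SEQ_OFFSET_BITS);
    const uint16_t *p = &sequences[seqindex & SEQ_OFFSET_MASK];
    size_t i;

    assert(len <= cap);
    for (i = 0; i < len; i++) {
        uint32_t cp = *p++;
        if ((cp & 0xFC00u) == 0xD800u) {
            /* High surrogate: combine with the following low surrogate. */
            uint32_t lo = *p++;
            cp = 0x10000u + (((cp & 0x3FFu) << 10) | (lo & 0x3FFu));
        }
        out[i] = cp;
    }
    return len;
}
```

In this sketch a property record would hold only the 16-bit `SEQINDEX(...)` value, so each sequence drops its terminator and most code points take two bytes instead of four.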

data/data_generator.rb
test/printproperty.c
utf8proc.c
utf8proc.h
utf8proc_data.c